Appl. No.: 10/689,365
Amdt. dated 17 March 2008

Reply to Examiner-Initiated Interview of 12 March 2008
Amendments to the Claims: 

This listing of claims will replace all prior versions, and listings, of claims in the application: 

Listing of Claims:

1. (cancelled)

2. (currently amended) [[The method of claim 1 wherein at least one of balancing at least one
line of processing elements in a first dimension and balancing at least one line of PEs in a next
dimension comprises:]] A method for balancing the work load of an n-dimensional array of
processing elements, wherein each dimension of said array includes said processing elements
arranged in a plurality of lines and wherein each of said processing elements has a local number
of tasks associated therewith, the method comprising:

balancing a work load across at least one line of processing elements in a first dimension 
by redistributing tasks amongst the processing elements in said line; 

balancing a work load across at least one line of processing elements in a next dimension 
by redistributing tasks amongst the processing elements in said line; and

repeating said balancing at least one line of processing elements in a next dimension by 
redistributing tasks amongst the processing elements in said line for each dimension of said
n-dimensional array until the work load is balanced across all said processing elements; and
wherein said balancing a work load comprises:

calculating a total number of tasks for said line, wherein said total number of tasks for 
said line equals the sum of said local number of tasks for each of said processing elements on 
said line; 

notifying each of said processing elements on said line of said total number of tasks 
for said line; 

calculating a local mean number of tasks for each of said processing elements on said 
line;

calculating a local deviation from said local mean number for each of said processing
elements on said line;

determining a first local cumulative deviation for each of said processing elements on
said line;

determining a second local cumulative deviation for each of said processing elements 
on said line; and 

redistributing tasks among said processing elements on said line in response to at 
least one of said first local cumulative deviation and said second local cumulative 
deviation.
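
The dimension-by-dimension sweep recited in claim 2 can be sketched for a two-dimensional array as follows. This is only an illustrative sketch under stated assumptions: the helper names `local_means`, `balance_line` and `balance_2d` are hypothetical, E_r is assumed to equal the position r of the processing element on its line, and each line is set directly to its local means rather than moving tasks one at a time between neighbours.

```python
def local_means(total, n):
    # Assumed choice E_r = r, so M_r = Trunc((V + r) / N) for r = 0..N-1.
    return [(total + r) // n for r in range(n)]


def balance_line(line):
    # Shortcut for the redistribution step: every PE ends with its local mean.
    return local_means(sum(line), len(line))


def balance_2d(grid):
    # First dimension: balance every row.
    rows = [balance_line(row) for row in grid]
    # Next dimension: balance every column of the row-balanced array.
    cols = [balance_line(list(col)) for col in zip(*rows)]
    # Transpose back to row-major order.
    return [list(row) for row in zip(*cols)]


if __name__ == "__main__":
    for row in balance_2d([[9, 1, 4, 2], [0, 0, 0, 7], [3, 3, 3, 3]]):
        print(row)
```

In this sketch the total number of tasks is conserved and, after the second sweep, the task counts within each column differ by at most one. Whether the whole array ends up balanced depends on how the excess tasks are placed within each line.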

3. (original) The method of claim 2 wherein two or more lines in at least one of said first
dimension and said next dimension are balanced in parallel. 

4. (previously presented) The method of claim 2 wherein said calculating a total number of 
tasks for said line comprises sequentially summing said local number of tasks for each of said 
processing elements on said line from a first end of said line to a second end of said line. 

5. (original) The method of claim 2 wherein said calculating said total number of tasks for
said line includes solving the equation $V = \sum_{i=0}^{N-1} v_i$, where $V$ represents said total
number of tasks for said line, $N$ represents the number of processing elements on said line, and
$v_i$ represents said local number of tasks for a local $PE_r$ on said line.

6. (original) The method of claim 2 wherein said notifying step includes passing said total
number of tasks from a second end of said line to a first end of said line.
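
Claims 4 to 6 describe a forward pass that accumulates the total and a return pass that distributes it. A minimal sketch of that pattern, assuming a line is modelled as a Python list ordered from the first end to the second end (the function name `sum_and_notify` is hypothetical):

```python
def sum_and_notify(local_tasks):
    # Forward pass: partial sums travel from the first end to the second end.
    running = 0
    for v in local_tasks:
        running += v
    total = running  # the last PE now holds V, the sum of all v_i

    # Return pass: V is handed back from the second end to the first end,
    # so each PE records the total as it passes by.
    known_total = [None] * len(local_tasks)
    for r in range(len(local_tasks) - 1, -1, -1):
        known_total[r] = total
    return known_total


if __name__ == "__main__":
    print(sum_and_notify([9, 1, 4, 2]))
```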

7. (previously presented) The method of claim 2 wherein said calculating a local mean
number of tasks includes solving the equation $M_r = \mathrm{Trunc}((V + E_r)/N)$, where $M_r$ represents
said local mean for a local processing element $PE_r$ on said line, $N$ represents the total number of
PEs on said line, $V$ is the total number of tasks, and $E_r$ is a number in the range of 0 to $(N-1)$.

8. (previously presented) The method of claim 7 wherein each processing element has a
different $E_r$ value.

9. (previously presented) The method of claim 7 wherein said Trunc function is responsive 
to $E_r$ such that said total number of tasks for said line is equal to the sum of the local mean
number of tasks for each processing element on said line. 

10. (currently amended) The method of claim 7 wherein said local mean
$M_r = \mathrm{Trunc}((V + E_r)/N)$ for each local $PE_r$ on said line is equal to either $X$ or $(X+1)$, where $X$
is equal to a local mean.
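
The properties recited in claims 7 to 10 can be checked with a short sketch. It assumes the simplest assignment satisfying claim 8, namely E_r = r for the r-th processing element on the line; the claims require only that each E_r be distinct and lie in the range 0 to (N-1). The function name `local_means` is hypothetical.

```python
def local_means(total, n):
    # M_r = Trunc((V + E_r) / N) with the assumed choice E_r = r.
    # Integer floor division acts as Trunc for non-negative task counts.
    return [(total + r) // n for r in range(n)]


if __name__ == "__main__":
    means = local_means(43, 8)
    print(means)               # every entry is X or X + 1, with X = 43 // 8
    print(sum(means) == 43)    # the local means add up to V
```

Because the E_r values cover 0 to (N-1) exactly once, exactly V mod N of the processing elements round up to X+1 and the rest stay at X, which is why the local means sum to V.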

11. (original) The method of claim 2 wherein said calculating a local deviation for each
processing element on said line includes finding a difference between said local number of tasks
for each $PE_r$ and said local mean number of tasks for each $PE_r$.

12. (original) The method of claim 2 wherein said determining a first local cumulative 
deviation includes sequentially summing said local deviations for each $PE_r$ from a first end of
said line to an adjacent upstream $PE_{r-1}$ on said line.

13. (original) The method of claim 2 wherein said determining a second local cumulative 
deviation includes finding a difference between the negative of said local deviation for each $PE_r$
and said first local cumulative deviation for each $PE_r$.
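
Claims 11 to 13 define three per-element quantities. The sketch below computes them for one line, assuming zero-based positions with the first end at index 0; the function name `deviations` and the short names `dev`, `first` and `second` are hypothetical labels for the local deviation, the first local cumulative deviation and the second local cumulative deviation.

```python
def deviations(tasks, means):
    # Claim 11: local deviation = local number of tasks - local mean.
    dev = [v - m for v, m in zip(tasks, means)]

    # Claim 12: first cumulative deviation = sum of the local deviations
    # from the first end of the line up to the upstream neighbour PE_{r-1}.
    first = []
    running = 0
    for d in dev:
        first.append(running)
        running += d

    # Claim 13: second cumulative deviation = (-local deviation) - first.
    second = [-d - f for d, f in zip(dev, first)]
    return dev, first, second


if __name__ == "__main__":
    print(deviations([9, 1, 4, 2], [4, 4, 4, 4]))
```

When the local means sum to the line total, the local deviations sum to zero, so the second cumulative deviation of a processing element equals the sum of the local deviations of the elements downstream of it.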

14. (original) The method of claim 2 wherein said redistributing tasks among said processing 
elements on said line comprises: 

transferring a task from a local $PE_r$ to a left-adjacent $PE_{r-1}$ if said first local
cumulative deviation for said local $PE_r$ is a negative value;

transferring a task from said local $PE_r$ to a right-adjacent $PE_{r+1}$ if said second local
cumulative deviation for said local $PE_r$ is a negative value.

15. (original) The method of claim 2 wherein said redistributing tasks among said processing 
elements on said line comprises:

transferring a task from a local $PE_r$ to a left-adjacent $PE_{r-1}$ if said second local
cumulative deviation for said local $PE_r$ is a positive value;

transferring a task from said local $PE_r$ to a right-adjacent $PE_{r+1}$ if said first local
cumulative deviation for said local $PE_r$ is a positive value.
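
The transfer decisions of claim 14 can be illustrated with a small sketch that lists which single-task moves a line would make in one step. It takes the two cumulative deviations as plain lists indexed from the first end of the line; the function name `planned_transfers` and the (source, destination) tuple format are assumptions made only for illustration, and only the negative-value tests of claim 14 are modelled.

```python
def planned_transfers(first, second):
    # Returns (source index, destination index) pairs, one task per pair.
    n = len(first)
    moves = []
    for r in range(n):
        # Negative first cumulative deviation: send one task to the left.
        if r > 0 and first[r] < 0:
            moves.append((r, r - 1))
        # Negative second cumulative deviation: send one task to the right.
        if r < n - 1 and second[r] < 0:
            moves.append((r, r + 1))
    return moves


if __name__ == "__main__":
    # Cumulative deviations for task counts [9, 1, 4, 2] with local mean 4.
    print(planned_transfers([0, 5, 2, 2], [-5, -2, -2, 0]))
```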

16. (original) The method of claim 2 wherein said calculating a local mean number of tasks;
said calculating a local deviation; said determining a first local cumulative deviation; said
determining a second local cumulative deviation; and said redistributing tasks are completed in
parallel for each processing element on said line.

17. (original) The method of claim 16 wherein said calculating a local mean number of
tasks; said calculating a local deviation; said determining a first local cumulative deviation; said
determining a second local cumulative deviation; and said redistributing tasks are completed in
parallel for each line in a selected dimension.

18. (cancelled) 

19. (original) The method of claim 2 wherein said calculating a local deviation, said 
determining a first local cumulative deviation, said determining a second local cumulative 
deviation, and said redistributing tasks among said processing elements are repeated until said 
local deviation, said first local cumulative deviation, and said second local cumulative deviation 
for each of said processing elements is zero. 
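
The repetition recited in claim 19 can be sketched as a loop over one line that stops when all three quantities are zero for every processing element. This is a simplified model, not a definitive implementation: it assumes E_r = r, moves at most one task across each pair of neighbours per round using the negative-value tests of claim 14, and uses the hypothetical function name `balance_until_zero`.

```python
def balance_until_zero(tasks):
    n = len(tasks)
    total = sum(tasks)
    means = [(total + r) // n for r in range(n)]  # assumed E_r = r
    tasks = list(tasks)
    rounds = 0
    while True:
        dev = [v - m for v, m in zip(tasks, means)]
        first = [sum(dev[:r]) for r in range(n)]
        second = [-d - f for d, f in zip(dev, first)]
        # Stop once every deviation and cumulative deviation is zero.
        if not any(dev) and not any(first) and not any(second):
            return tasks, rounds
        nxt = list(tasks)
        for r in range(n):
            if r > 0 and first[r] < 0:        # one task to the left
                nxt[r] -= 1
                nxt[r - 1] += 1
            if r < n - 1 and second[r] < 0:   # one task to the right
                nxt[r] -= 1
                nxt[r + 1] += 1
        tasks = nxt
        rounds += 1


if __name__ == "__main__":
    print(balance_until_zero([9, 1, 4, 2]))
```

In this model each round reduces the magnitude of every non-zero first cumulative deviation by one, so the number of rounds equals the largest such magnitude at the start.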

20. (currently amended) A method for balancing a work load across one dimension of an
n-dimensional array of processing elements, wherein each of said n-dimensions is traversed by a
plurality of lines and wherein each of said lines has a plurality of processing elements with a
local number of tasks associated therewith, the method comprising:

balancing said plurality of lines in one dimension by redistributing tasks amongst the 
processing elements in each of said plurality of said lines[[J]; 
balancing said plurality of lines in a next higher dimension: 

repeating said balancing said plurality of lines in a next higher dimension for each 
remaining dimension of said n-dimensional array, wherein each of said balanced lines 
includes PEs with either a number of local tasks equal to X or a number of local
tasks equal to (X+1), where X equals a local mean;
substituting the value zero (0) for each processing element having X local number of 
tasks; 

substituting the value one (1) for each processing element having (X+1) local number
of tasks; and 

shifting said values for each processing element within said balanced lines until a sum 
of said processing elements relative to a second dimension has only two values. 
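
The substitution and shifting steps of claim 20 can be illustrated on a two-dimensional array whose rows are already balanced. The claim does not say how far each line is shifted, so the sketch below makes an assumption: the ones in each row sit at the right-hand end (as they would with E_r = r), and each row is cyclically rotated so that its ones begin where the previous row's ones ended. The names `to_bits` and `shift_rows` are hypothetical.

```python
def to_bits(row):
    # Substitute 0 for PEs holding X tasks and 1 for PEs holding X + 1.
    x = sum(row) // len(row)
    return [v - x for v in row]


def shift_rows(bits):
    # Assumes the ones of every row are contiguous at the right-hand end.
    n = len(bits[0])
    start = 0  # column where the next row's ones should begin
    shifted = []
    for row in bits:
        k = sum(row)
        s = (n - k - start) % n           # left rotation amount
        shifted.append(row[s:] + row[:s])
        start = (start + k) % n
    return shifted


if __name__ == "__main__":
    bits = [to_bits(r) for r in ([3, 3, 4, 4], [1, 1, 1, 2], [5, 6, 6, 6])]
    shifted = shift_rows(bits)
    print(shifted)
    print([sum(col) for col in zip(*shifted)])
```

Dealing the ones out round-robin in this way keeps each row's count of ones unchanged while leaving the column sums with at most two distinct values.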

21. (cancelled)

22. (previously amended) The method of claim 20 wherein said balancing said plurality of
lines in one dimension comprises:

calculating a total number of tasks present within at least one of said lines; 

notifying each processing element on said line of said total number of tasks for said
line;

determining each processing element's share of said total number of tasks on said 
line; 

calculating a local deviation from said previous steps; 

determining a first local cumulative deviation for each processing element on said 
line using said local deviation; 

determining a second local cumulative deviation for each processing element on said 
line using said local deviation; 

redistributing tasks among each processing element on said line in response to at least 
one of said first local cumulative deviation and said second local cumulative deviation.

23. (original) The method of claim 22 wherein said notifying each processing element
comprises:

serially summing said total number of tasks present on said line; and
transmitting said total number of tasks to each processing element on said line.

24. (original) The method of claim 22 wherein said determining each processing element's
share of said total number of tasks comprises:

calculating a local mean number of tasks for each processing element on said line;
and

calculating a local deviation from said local mean number of tasks for each 
processing element on said line by finding the difference between said local number of 
tasks and said local mean number of tasks for each processing element on said line. 

25. (previously presented) The method of claim 24 wherein said calculating a local mean
number of tasks for each processing element on said line comprises using a rounding function
$M_r = \mathrm{Trunc}((V + E_r)/N)$, where $M_r$ represents said local mean of a local processing element
$PE_r$, $N$ represents the total number of processing elements on said line, $V$ is the total number of
tasks, and $E_r$ represents a number in the range of 0 to $(N-1)$.

26. (previously presented) The method of claim 25 wherein said Trunc function is responsive 
to $E_r$ such that said total number of tasks for said line is equal to the sum of the local mean
number of tasks for each of said processing elements in said line. 

27. (currently amended) The method of claim 25 wherein said local mean
$M_r = \mathrm{Trunc}((V + E_r)/N)$ for each local processing element on said line is equal to either $X$ or
$(X+1)$, where $X$ is equal to a local mean.

28. (original) The method of claim 22 wherein said determining a first local cumulative 
deviation for each processing element on said line includes summing said local deviations for 
each upstream processing element on said line. 

29. (original) The method of claim 22 wherein said determining a second local cumulative 
deviation for each processing element on said line includes finding the difference between the 
negative of said local deviation and said first local cumulative deviation for each processing 
element on said line.

30. (original) The method of claim 22 wherein said redistributing tasks among each
processing element on said line in response to at least one of said first local cumulative deviation 
and said second local cumulative deviation comprises: 

transferring a task from a first processing element on said line to a second processing 
element on said line if said first local cumulative deviation for said first processing 
element is a negative value; and 

transferring a task from said second processing element on said line to said first 
processing element on said line if said first local cumulative deviation for said second 
processing element is a positive value. 

31. (original) The method of claim 22 wherein said redistributing tasks among each
processing element on said line in response to at least one of said first local cumulative deviation
and said second local cumulative deviation comprises:

transferring a task to a first processing element on said line from a second processing 
element on said line if said second local cumulative deviation for said first processing 
element is a negative value; and 

transferring a task to said second processing element on said line from said first 
processing element on said line if said second local cumulative deviation for said second 
processing element is a positive value. 

32. (original) The method of claim 24 wherein said calculating a local deviation, said 
determining a first local cumulative deviation, said determining a second local cumulative
deviation, and said redistributing tasks among said processing elements are repeated until said 
local deviation, said first local cumulative deviation, and said second local cumulative deviation 
for each of said processing elements is zero. 

33. (cancelled) 

34. (new) A computer memory storing a set of instructions which, when executed, perform a 
method for balancing a work load across one dimension of an n-dimensional array of processing 
elements, wherein each of said n-dimensions is traversed by a plurality of lines and wherein each 
of said lines has a plurality of processing elements with a local number of tasks associated
therewith, the method comprising: 

balancing said plurality of lines in one dimension by redistributing tasks amongst the 
processing elements in each of said plurality of said lines;

balancing said plurality of lines in a next higher dimension; 

repeating said balancing said plurality of lines in a next higher dimension for each 
remaining dimension of said n-dimensional array, wherein each of said balanced lines 
includes PEs with either a number of local tasks equal to X or a number of local tasks equal to
(X+1), where X equals a local mean;

substituting the value zero (0) for each processing element having X local number of 
tasks; 

substituting the value one (1) for each processing element having (X+1) local number
of tasks; and 

shifting said values for each processing element within said balanced lines until a sum 
of said processing elements relative to a second dimension has only two values.